pacman::p_load(sf, spNetwork, tmap, tidyverse)In-class Exercise 3: Advanced Spatial Point Patterns Analysis
1 Overview
Network constrained Spatial Point Pattern Analysis (NetSPAA) is a collection of specialized methods designed for analyzing spatial point events that occur on or alongside a network. These events could include, for example, the locations of traffic accidents or childcare centers, while the network itself might represent road systems, river networks, or other linear infrastructures.
In this in-class exercise, we will learn practical experience with the key functions of the spNetwork package, specifically:
- Deriving Network Kernel Density Estimation (NKDE)
- Performing Network-based G-function and K-function analysis
2 The data
In this exercise, we will analyse the spatial distribution of childcare center in Punggol planning area using following geospatial datasets:
| Dataset | Description | Format |
|---|---|---|
| Punggol_St | Line features geospatial data which store the road network within Punggol Planning Area. | ESRI shapefile |
| Punggol_CC | Point features geospatial data which store the location of childcare centres within Punggol Planning Area. | ESRI shapefile |
3 Installing and launching the R packages
We will use following packages in this exercise:
| Package | Description |
|---|---|
| sf | Provides functions to manage, processing, and manipulate Simple Features, a formal geospatial data standard that specifies a storage and access model of spatial geometries such as points, lines, and polygons. |
| spNetwork | Provides functions to perform Spatial Point Patterns Analysis such as kernel density estimation (KDE) and K-function on network. It also can be used to build spatial matrices (‘listw’ objects like in ‘spdep’ package) to conduct any kind of traditional spatial analysis with spatial weights based on reticular distances. |
| tidyverse | Provides collection of functions for performing data science task such as importing, tidying, wrangling data and visualising data. |
| tmap | Provides functions for plotting cartographic quality static point patterns maps or interactive maps by using leaflet API |
To install and launch the four R packages.
4 Data Import and Preparation
4.1 Import
The code chunk below uses st_read() of sf package to important Punggol_St and Punggol_CC geospatial data sets into RStudio as sf data frames.
network <- st_read(dsn="data/geospatial",
layer="Punggol_St")Reading layer `Punggol_St' from data source
`/Users/cham/project/Geospatial-Analytics/chrismanafe/ISSS626-GAA/in_class_ex/in_class_ex03/data/geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 2642 features and 2 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 34038.56 ymin: 40941.11 xmax: 38882.85 ymax: 44801.27
Projected CRS: SVY21 / Singapore TM
childcare <- st_read(dsn="data/geospatial",
layer="Punggol_CC") %>%
st_zm(drop = T,
what = "ZM")Reading layer `Punggol_CC' from data source
`/Users/cham/project/Geospatial-Analytics/chrismanafe/ISSS626-GAA/in_class_ex/in_class_ex03/data/geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 61 features and 1 field
Geometry type: POINT
Dimension: XYZ
Bounding box: xmin: 34423.98 ymin: 41503.6 xmax: 37619.47 ymax: 44685.77
z_range: zmin: 0 zmax: 0
Projected CRS: SVY21 / Singapore TM
We noticed that geometry attribute of childcare has 3D point, but we require 2D points for NKDE function. So we need to drop the Z-dimension of geometry attribute.
4.2 Examine data structure
childcareSimple feature collection with 61 features and 1 field
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 34423.98 ymin: 41503.6 xmax: 37619.47 ymax: 44685.77
Projected CRS: SVY21 / Singapore TM
First 10 features:
Name geometry
1 kml_10 POINT (36173.81 42550.33)
2 kml_99 POINT (36479.56 42405.21)
3 kml_100 POINT (36618.72 41989.13)
4 kml_101 POINT (36285.37 42261.42)
5 kml_122 POINT (35414.54 42625.1)
6 kml_161 POINT (36545.16 42580.09)
7 kml_172 POINT (35289.44 44083.57)
8 kml_188 POINT (36520.56 42844.74)
9 kml_205 POINT (36924.01 41503.6)
10 kml_222 POINT (37141.76 42326.36)
st_crs(childcare)Coordinate Reference System:
User input: SVY21 / Singapore TM
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
networkSimple feature collection with 2642 features and 2 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 34038.56 ymin: 40941.11 xmax: 38882.85 ymax: 44801.27
Projected CRS: SVY21 / Singapore TM
First 10 features:
LINK_ID ST_NAME geometry
1 116130894 PUNGGOL RD LINESTRING (36546.89 44574....
2 116130897 PONGGOL TWENTY-FOURTH AVE LINESTRING (36546.89 44574....
3 116130901 PONGGOL SEVENTEENTH AVE LINESTRING (36012.73 44154....
4 116130902 PONGGOL SEVENTEENTH AVE LINESTRING (36062.81 44197....
5 116130907 PUNGGOL CENTRAL LINESTRING (36131.85 42755....
6 116130908 PUNGGOL RD LINESTRING (36112.93 42752....
7 116130909 PUNGGOL CENTRAL LINESTRING (36127.4 42744.5...
8 116130910 PUNGGOL FLD LINESTRING (35994.98 42428....
9 116130911 PUNGGOL FLD LINESTRING (35984.97 42407....
10 116130912 EDGEFIELD PLNS LINESTRING (36200.87 42219....
st_crs(network)Coordinate Reference System:
User input: SVY21 / Singapore TM
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
5 Visualising the Geospatial Data
We also use st_geometry() to focus purely on the spatial geometries of the object (points, lines, polygons).
plot(st_geometry(network))
plot(childcare,add=T,col='red',pch = 18)
tmap_mode('view')tmap mode set to interactive viewing
tm_shape(childcare) +
tm_dots(col = 'red') +
tm_shape(network) +
tm_lines()tmap_mode('plot')tmap mode set to plotting
6 Visualising the Geospatial Data
plot(st_geometry(network))
plot(childcare,add=T,col='red',pch = 19)
tmap_mode('view')tmap mode set to interactive viewing
tm_shape(childcare) +
tm_dots(col = 'red') +
tm_shape(network) +
tm_lines()tmap mode set to plotting
7 Network Kernel Density Estimate (NKDE) Analysis
7.1 Preparing the lixels object
Before computing NKDE, the SpatialLines object need to be split into lixels with a specified minimal distance. This operation can be accomplished using lixelize_lines() function from the spNetwork package.
lixels <- lixelize_lines(network,
700,
mindist = 350)In above code, we set the length of a lixel to 700m and set the minimum length of a lixel to 350m.
Additional Notes:
- If the final lixel after splitting is shorter than the specified minimal distance, it is combined with preceding lixel.
- If the minimum distance is not specified (
NULL), it defaults to \(\frac{\text{maxdist}}{10}\). - If the segments that are already shorter than the minimum distance are not modified.
- Another function,
lixelize_lines.mc(), is available, which provides multicore support for processing.
7.2 Generate line centre points
lines_center() of spNetwork is used to generate a SpatialPointsDataFrame with line centre points. The points are located at center of the line based on the length of the line.
samples <- lines_center(lixels) 7.3 Visualize line center points
tmap_mode("view")tmap mode set to interactive viewing
tm_shape(lixels) +
tm_lines() +
tm_shape(samples) +
tm_dots(size = 0.01)tmap_mode("plot")tmap mode set to plotting
7.4 Performing NKDE
To compute NKDE:
densities <- nkde(network,
events = childcare,
w = rep(1, nrow(childcare)),
samples = samples,
kernel_name = "quartic",
bw = 300,
div= "bw",
method = "simple",
digits = 1,
tol = 1,
grid_shape = c(1,1),
max_depth = 8,
agg = 5,
sparse = TRUE,
verbose = FALSE)It is recommended to read the documentation of nkde() function to understand various parameters too calibrate NKDE model.
7.5 Visualising NKDE
7.5.1 Insert computed density values into samples and lixels object as density field
samples$density <- densities
lixels$density <- densities7.6 Rescale density values
Since svy21 projection system is in meter, the computed density values are very small i.e. 0.0000005. We can rescale the density values from number of events per meter to number of events per kilometer using following code:
# rescaling to help the mapping
samples$density <- samples$density*1000
lixels$density <- lixels$density*10007.7 Visualise using tmap package
We use tmap package to prepare interactive and high cartographic quality map visualisation.
tmap_mode('view')tmap mode set to interactive viewing
tm_shape(lixels)+
tm_lines(col="density")+
tm_shape(childcare)+
tm_dots()tmap_mode('plot')tmap mode set to plotting
Road segments with relatively higher density of childcare centres is highlighted with darker color. Road segments with relatively lower density of childcare centres is highlighted with lighter color.
8 Network Constrained G- and K-Function Analysis
We will perform a Complete Spatial Randomness (CSR) test using the kfunctions() function from the spNetwork package.
Null Hypothesis
H₀: The observed spatial point events (i.e., the distribution of childcare centers) are uniformly distributed over the street network in the Punggol Planning Area.
CSR Test Assumptions
- The CSR test is based on the binomial point process assumption.
- This implies that childcare centers are randomly and independently distributed across the street network.
Interpretation of Results
- If the null hypothesis is rejected, we may infer that the distribution of childcare centers is not random.
- This suggests that the centers are spatially interacting and dependent on one another, potentially forming non-random patterns.
kfun_childcare <- kfunctions(network,
childcare,
start = 0,
end = 1000,
step = 50,
width = 50,
nsim = 49,
resolution = 50,
verbose = FALSE,
conf_int = 0.05)| Argument | Description |
|---|---|
lines |
A SpatialLinesDataFrame containing the sampling points. The geometries must be valid; invalid geometries may cause crashes. |
points |
A SpatialPointsDataFrame representing the points on the network, which will be snapped onto the network. |
start |
A double indicating the starting value for evaluating the k and g functions. |
end |
A double specifying the final value for evaluating the k and g functions. |
step |
A double representing the step size or interval between two evaluations of the k and g functions. |
width |
The width of each donut ring for the g-function. |
nsim |
An integer specifying the number of Monte Carlo simulations to run. For example, 50 simulations were performed in the given code chunk. Usually, more simulations are required for robust inference. |
resolution |
The resolution for simulating random points on the network. A lower resolution reduces calculation time. If NULL, random points are placed anywhere on the network. Specifying a value splits edges, and random points are selected from vertices. |
conf_int |
A double representing the width of the confidence interval, with a default value of 0.05. |
The output of kfunctions() is a list with the following values:
- plotk, a ggplot2 object representing the values of the k-function
- plotg, a ggplot2 object representing the values of the g-function
- values, a DataFrame with the values used to build the plots
Visualise ggplot2 object of k-function
kfun_childcare$plotk
9 References
To-be-updated